Member

@jwaiton jwaiton commented Apr 22, 2025

This PR is an offshoot of include-WD1-processing and provides the lazy reader() and writer() functions for memory-efficient manipulation of data.

It sits on top of #39, so #39 will need to be merged first.
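
For readers unfamiliar with the pattern, here is a minimal sketch of what a lazy reader/writer pair can look like. It is not the code in this PR: the signatures, the `overwrite` argument as used here, and the pickle-based file format are all assumptions chosen so the example runs with the standard library alone. The point is only that records are streamed one at a time instead of being held in memory together.

```python
import os
import pickle
import tempfile
from contextlib import contextmanager


@contextmanager
def writer(path, overwrite=True):
    """Hypothetical lazy writer: yields a function that stores one record per call."""
    mode = "wb" if overwrite else "ab"  # truncate the file, or append to it
    with open(path, mode) as f:
        def write(record):
            pickle.dump(record, f)  # only this record is serialised at a time
        yield write


def reader(path):
    """Hypothetical lazy reader: a generator yielding one record at a time."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:  # raised by pickle.load at end of file
                return


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "events.bin")
        with writer(path, overwrite=True) as write:
            for i in range(3):
                write({"event": i})
        # overwrite=False appends to the existing records
        with writer(path, overwrite=False) as write:
            write({"event": 3})
        return [rec["event"] for rec in reader(path)]
```

Calling `demo()` writes three records, appends a fourth, and reads all four back through the generator.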

This was referenced Apr 22, 2025
@jwaiton jwaiton mentioned this pull request Sep 23, 2025
@jwaiton jwaiton requested a review from bpalmeiro October 18, 2025 12:54
Collaborator

@bpalmeiro bpalmeiro left a comment

Good in general, nice work. It's a bit lacking in the test suite: you have 5 condition cases (if I counted right), and none are tested. Given the time pressure, I could let that fly and come back later to pay off that technical debt.

Member Author

jwaiton commented Jan 22, 2026

This PR is mildly outdated, as I included significant (almost necessary) improvements in a pending branch. I did this to keep the PRs manageable for reviewers, but I could merge them together in a separate PR if you wish. Thoughts, @bpalmeiro?

Comment on lines 103 to 105
# sanity check that the output is what you expect it is
for i, sample in enumerate(output_data):
    assert sample == overwrite_data[i]
Collaborator

assert not np.array_equal(np.array(output_data), np.array(overwrite_data))

Collaborator

Am I missing anything?

Member Author

It would be assert np.array_equal(np.array(output_data), np.array(overwrite_data)) (without the not), but yes, I'll change this.
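
To make the difference between the two forms concrete, here is a small sketch. The data values are made up for illustration and `same_data` is a hypothetical helper, not something from the PR; only `np.array_equal` and `np.array` are real NumPy calls.

```python
import numpy as np


def same_data(output_data, overwrite_data):
    """True only when both containers have the same shape and the same values."""
    return bool(np.array_equal(np.array(output_data), np.array(overwrite_data)))


# Hypothetical stand-ins for the variables used in the test.
overwrite_data = [(0, 1.5), (1, 2.5), (2, 3.5)]
output_data = [(0, 1.5), (1, 2.5), (2, 3.5)]

# Passes: what was read back matches what was written.
assert same_data(output_data, overwrite_data)

# The negated form would only pass if the reader returned different data,
# which is the opposite of what the sanity check is meant to verify.
corrupted = [(0, 1.5), (1, 2.5), (2, 9.9)]
assert not same_data(corrupted, overwrite_data)
```

One design note: a single whole-array comparison also catches a length mismatch, which the element-wise loop over `output_data` would miss if `output_data` were shorter than `overwrite_data`.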

Collaborator

@bpalmeiro bpalmeiro left a comment

Good. You can also add two easy ones to test the first checks: create an empty file, then try to save data to it later (to check that it creates the group when it isn't there).
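
A rough sketch of how such a test could be shaped is below. Everything in it is an assumption: `save` and `load` are hypothetical helpers, the group name "RAW" is made up, and a JSON file stands in for whatever container format the real writer uses. It only illustrates the flow of "empty file first, save later, group appears".

```python
import json
import os
import tempfile


def save(path, group, rows):
    """Hypothetical writer: append rows to `group`, creating the group if absent."""
    # A zero-byte file is treated as a container that holds no groups yet.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path) as f:
            store = json.load(f)
    else:
        store = {}
    store.setdefault(group, []).extend(rows)
    with open(path, "w") as f:
        json.dump(store, f)


def load(path, group):
    """Hypothetical reader: return all rows stored under `group`."""
    with open(path) as f:
        return json.load(f)[group]


def check_group_created_in_empty_file():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "empty.json")
        open(path, "w").close()        # create an empty file first
        save(path, "RAW", [1, 2, 3])   # the group does not exist yet
        assert load(path, "RAW") == [1, 2, 3]
        save(path, "RAW", [4])         # the group now exists, so rows are appended
        assert load(path, "RAW") == [1, 2, 3, 4]
        return True
```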

Comment on lines 110 to 112
vice versa, will write random data to the end of the prior file
and check that the reader reads out both new and old data
'''
Collaborator

I'd add an explicit comment along the lines of: testing that the overwrite flag works when False.

Collaborator

@bpalmeiro bpalmeiro left a comment

I'm satisfied with the test, and following our offline conversation, let's get this PR rolling, expecting improvements already in the pipeline. Good job.

jwaiton and others added 5 commits January 22, 2026 16:55
tests cover:
- overwriting capabilities (True and False)
- fixed-size capabilities (works as expected, errors when expected)
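
As an illustration of the "works as expected, errors when expected" pairing for fixed-size writes, here is a minimal sketch. `FixedSizeWriter` is a hypothetical in-memory stand-in, not the writer from this PR, and the choice of `IndexError` is an assumption; the real code may preallocate differently and raise something else.

```python
class FixedSizeWriter:
    """Hypothetical stand-in: preallocates `size` slots and refuses to grow."""

    def __init__(self, size):
        self.slots = [None] * size

    def write(self, index, value):
        if not 0 <= index < len(self.slots):
            raise IndexError(f"index {index} outside fixed size {len(self.slots)}")
        self.slots[index] = value


def check_fixed_size():
    w = FixedSizeWriter(3)
    # Works as expected: every preallocated slot can be filled.
    for i in range(3):
        w.write(i, i * 10)
    assert w.slots == [0, 10, 20]
    # Errors when expected: writing past the preallocated size is rejected.
    try:
        w.write(3, 30)
    except IndexError:
        return True
    return False
```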
@jwaiton jwaiton merged commit 8119eb7 into nu-ZOO:main Jan 22, 2026
1 check passed